Topics |
|
The B-tree cache is comprised of a collection of buckets used to store file based objects. The cache functions as a buffer for a copy of the object stored in memory and the address of the object in the file. Buckets are used to keep file addresses and memory buffers paired together. When a request is made to the cache for a particular object by its file address, the cache will read the object from the disk and store it in a bucket that resides in memory. If the object is already in a bucket the cache will not read it from the disk. After the appropriate bucket is reserved, a pointer to the bucket is returned.
When the cache fills up and another bucket is needed, the data in one of the occupied buckets is flushed back to disk so the bucket can be reused. Only buckets that have been modified since they were last loaded are flushed to disk. If a bucket has not been modified it is reused without being flushed. Buckets are reference counted to ensure that there are no open references to a bucket before it is reused. The cache classes implement cache pointers that reference cache buckets directly, with each cache pointer being initialized after a bucket is reserved. Buckets are only reorganized when a bucket is reserved.
The Bucketb class is implemented independently of the data stored to provide the maximum code sharing. It stores pointers to the objects rather then the objects themselves. The cache bucket class is derived from the Bucketb class, creating a family of classes specific to the data type stored. Both of the cache bucket classes are used to keep addresses and memory buffers paired together, to prevent a file based object's address from being associated with the wrong memory buffer. The data independent cache bucket base class is completely inlined:
class Bucketb { public: Bucketb() { } // Other classes initialize buckets public: void Lock() { ++refcnt; } void Unlock() { --refcnt; } int IsLocked() { return refcnt; } void SetDirty() { Dirty = 1; } void SetDirty(VBDFile &f) { Dirty = 1; if(Address) Store(f); } int IsNull() { return Address == 0; } void Release() { refcnt = 0; } void Flush(VBDFile &f) { if (Address && Dirty) Store(f); } virtual void Fetch(VBDFile &f) = 0; // Depends on bucket data virtual void Store(VBDFile &f) = 0; // Depends on bucket data public: friend class Cacheb; friend class CachePtrb; friend void Report(Cacheb &c, int inspect=0); // Test purposes only protected: void MakeNull() { Address = 0; refcnt = 0; Dirty = 0; } protected: FAU Address; // Location of bucket's data in the file int refcnt; // How many cache pointers pointing to this bucket char Dirty; // True when data doesn't match what's in file Bucketb *Prior; // Pointer to adjacent buckets in the cache Bucketb *Next; // Most recently acquired bucket };
The Bucket class is derived from the Bucketb base class and implemented as template class or a non-template class. To avoid any portability problems with template classes, define the __NOT_USING_TEMPLATE_CLASS__ macro at compile time and directly code the Bucket class, the Cache class, and the CachePtr class for the data type that will be used in the "cactype.h" file. The cactype.h include file is used to set the data type for the non-template version of the Bucket class, the Cache class, and the CachePtr class. Each data type is set by macros that enable type definitions of the specific data type. This allows the cache classes to be used for several data types. Each type must be determined at compile time and only one type can be used for each compile. Define the __BTREE_MNODE__ at compile time to use B-tree multi-way nodes for the bucket data type.
The template version of the Bucket class uses a template parameter, as a base class to allow buckets to act likes tree nodes. The template parameter used as a base class links the cache and the tree nodes together via multiple inheritance.
template<class TYPE> class Bucket : public Bucketb, public TYPE { public: Bucket() { } // Call Bucketb and TYPE constructors public: virtual void Fetch(VBDFile &f) { // Fetch only the TYPE portion of the bucket f.Read((TYPE *)this, sizeof(TYPE), Address); Dirty = 0; } virtual void Store(VBDFile &f) { // Store only the TYPE portion of the bucket f.Write((TYPE *)this, sizeof(TYPE), Address); Dirty = 0; } };
Bucket::Fetch() exceptions:
CAccessViolation
CFileReadError
CFileNotReady
CFileSeekError
Bucket::Store() exceptions:
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
The Bucket::Fetch() and Bucket::Store() functions type cast the this pointer of the Bucket to calculate a pointer to the TYPE data. The compiler will add the appropriate offset to the this pointer of the object in question: (TYPE *)this(this + sizeof(Bucketb)). This pointer is passed when calling the file manager's VBDFile::Read() and VBDFile::Write() functions. This forces the Bucket::Fetch() and Bucket::Store() functions to use only the TYPE portion of the bucket and ignore the Bucketb::Address, Bucketb::refcnt Bucketb::Dirty, Bucketb::Prior, and Bucketb::Next data members inherited from the Bucketb base class.
In order for the TYPE parameter to work effectively, TYPE should not have any embedded pointers. Pointer values change each time the program is executed, therefore are meaningless if stored to disk. TYPE should not have any virtual functions because the virtual function table pointer will be stored to disk. The virtual function table pointer may not be valid from one program invocation to the next, therefore is meaningless if stored to disk. TYPE should not have a destructor other then an empty one. The cache classes are not set up to call the TYPE destructor when a bucket is overwritten. TYPE must have a default constructor because the Bucket class requires it to have one. TYPE cannot be a built-in type because built-in types cannot be used as base classes.
Two classes are used to implement the cache. The Cacheb class is used as a base class for the Cache class. The Cacheb base class incorporates functions that are independent of the bucket type. The cache is used to handle requests for file-based objects. The cache must determine whether an object is already loaded. If the object is not loaded the cache reserves a bucket and loads the object into memory. This cache design uses cache pointers to reference cache buckets directly, with each cache pointer being initialized after the bucket is reserved. The least-recently-reserved cache bucket is overwritten when the cache fills up. The Cacheb class organizes buckets in a circular doubly linked list.
Cacheb::Cacheb(Bucketb *b, int n, unsigned bkt_size) - Class constructor responsible organizing the buckets, allocated by the Cache constructor, in a circular doubly linked list.
virtual Cacheb::~Cacheb() - Class destructor provided for virtuality.
Bucketb *Cacheb::AcquireBkt() - Protected member function used to find the least recently reserved bucket that isn't locked by a non-zero reference count. The search starts with the least recently reserved bucket and moves to the most recently reserved bucket, stopping at the first unlocked bucket it finds. When an unlocked bucket is found, it is flushed and moved to the front of the list to become the most recently reserved bucket. A bucket is considered locked if one or more cache pointers reference it. If all buckets in the cache are locked, a cache full exception is thrown. Returns the pointer to bucket or returns zero if no bucket available.
Exceptions:
CCacheFull
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
Bucketb *Cacheb::FindBkt(FAU Address) - Protected member function used to search for a bucket in the cache connected to file address. The search starts with the most recently reserved bucket. Returns a pointer to the bucket or zero if the bucket is not found.
void Cacheb::MoveToFront(Bucketb *b) - Protected member function used to move bucket to front of list, so that it is treated as the most recently reserved bucket.
void Cacheb::Connect(VBDFilePtr &fp) - Public member function used to connect the cache to a reference counted file manager pointer. This function clears the cache before connecting it to the file.
Exceptions:
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
void Cacheb::Disconnect() - Public member function used to disconnect the cache from a reference counted VBD file manager pointer. This function clears the cache before disconnecting it from the file.
Exceptions:
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
void Cacheb::Flush(int empty_bkts = 0) - Public member function used to flush all buckets in the cache to disk. By default this function will not empty the buckets in memory. If empty_bkts is true, then the buckets are emptied by making them null.
Exceptions:
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
void Cacheb::Clear() - Public member function used to flush all buckets in the cache to disk, and then null them out.
Exceptions:
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
virtual Bucketb *Cacheb::ReserveBkt(FAU Address, int ensure_loaded=1) - Public member function used to reserve a bucket for the file address if the bucket does not already exist. By default the data is loaded in memory if it is not already in the bucket. If ensure_loaded is false then the bucket data is not loaded in memory. The acquired bucket is moved to the front of the list to become the most recently used bucket. Returns a pointer to the bucket or zero if an error occurs.
Exceptions:
CCacheFull
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CAccessViolation
CFileReadError
CFileNotReady
CChecksumError
void Cacheb::FastBind() - Public member function used by the CachePtrb::Bind() function to count the number of fast binds. Fast binds take place when a cache pointer references or binds to a bucket that previously existed, was unbound, and did not change its state in memory or on disk. After a fast bind occurs, the bucket is simply locked by the cache pointer without having to reserve another bucket in the cache.
FAU Cacheb::GetHits() - Public member function used to get the number of hits. Hits represent the number of buckets that did not have to be acquired because the bucket for the file address already existed in the cache.
FAU Cacheb::GetMisses() - Public member function used to get the number of misses. Misses represent the number of cache buckets that had to be acquired because the bucket for the file address did not exist in the cache.
FAU Cacheb::GetFastBinds() - Public member function used to get the number of fast binds that have occurred. Fast binds take place when a cache pointer references or binds to a bucket that previously existed, was unbound, and did not change its state in memory or on disk. After a fast bind occurs, the bucket is simply locked by the cache pointer without having to reserve another bucket in the cache.
Bucketb *Cacheb::GetHead() - Public member function that returns the head of the circular doubly linked list or the most recently reserved bucket.
int Cacheb::GetBuckets() - Public member function that returns the size of the cache in buckets.
The Cache class is derived from the Cacheb base class and implemented as template class or a non-template class. To avoid any portability problems with template classes, define the __NOT_USING_TEMPLATE_CLASS__ macro at compile time and directly code the Cache class, the Bucket class, and the CachePtr class for the data type that will be used in the "cactype.h" file. The cactype.h include file is used to set the data type for the non-template version of the Cache class, the Bucket class, and the CachePtr class. Each data type is set by macros that enable type definitions of the specific data type. This allows the cache classes to be used for several data types. Each type must be determined at compile time and only one type can be used for each compile. Define the __BTREE_MNODE__ at compile time to use B-tree multi-way nodes for the cache data type.
The Cache class adds two functions to handle the allocation and de-allocation of buckets. Buckets are stored in an array and organized as a circular doubly linked list by the Cacheb class. The cache size, in buckets, is set when a Cache object is constructed.
template<class TYPE> class Cache : public Cacheb { public: // n represents the cache size in buckets Cache(int n) : Cacheb(buckets = new Bucket<TYPE>[n], n, sizeof(Bucket<TYPE>)) { } virtual ~Cache() { if (fptr) Flush(); Disconnect(); delete[ ] buckets; } protected: Bucket<TYPE> *buckets; };
Cache::~Cache() exceptions:
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
CAccessViolation
CFileReadError
CFileNotReady
Cache pointers are used to work in conjunction with the reference counted cache buckets. Each cache pointer stores the file address of the object being pointed to, a pointer to the bucket containing the memory buffer of the object, and a pointer to the cache it is connected to.
The CachePtrb class is used as a data independent base class for the CachePtr class. The CachePtrb base class incorporates functions that are independent of the bucket type.
CacehPtrb::~CachePtrb() - Public class destructor
CacehPtrb::CachePtrb(Cacheb &c, FAU p) - Protected general class constructor use to connect the cache pointer class to a pre-allocated cache. The cache pointer stays unbound until needed.
CacehPtrb::CachePtrb(const CachePtrb &c) - Protected class copy constructor. After coping constructing the object the destination cache pointer will not be bound until needed, even if source is bound.
void CacehPtrb::operator=(const CachePtrb &c) - Private class assignment operator implemented to disallow assignment.
void CacehPtrb::Grab() - Public member function used to bind the cache pointer to a bucket containing data stored at a file address if it is not already bound.
Exceptions:
CNullPtr
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CAccessViolation
CFileReadError
CFileNotReady
CChecksumError
void CacehPtrb::Release() - Public member function used to signal to the bucket that the cache is through with it by unlocking the bucket if it is bound.
void CacehPtrb::NotifyUseOf() - Public member function used to move the referenced bucket to the front of the cache queue (most recently used.)
CacehPtrb::operator __LWORD__() const - Conversion function used to return the file address of the bucket data.
void CacehPtrb::Copy(const CachePtrb &c) - Protected member function used to copy one cache pointer into another. The destination cache pointer is not bound, even if the source cache pointer is bound.
Bucketb *CacehPtrb::Alloc(size_t n, FAU p=0) - Protected member function used to allocate a specified number of bytes (presumably the size of the data in the bucket) at a specified address in the cache's file, and set up a cache bucket to support it. If address does not equal zero, it means the room has already been allocated. A call to the Cacheb::ReserveBkt() function will setup bucket's address, cache pointer and reference count, but does not load any data. Returns a pointer to the bucket or zero if it could not allocate space for the bucket.
Exceptions:
CCacheFull
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CAccessViolation
CFileReadError
CFileNotReady
CChecksumError
void CacehPtrb::Delete(unsigned n) - Protected member function used to de-allocate the file data pointed to by this cache pointer, of the specified size. De-allocation only takes place if this cache pointer is pointing to a bucket bound by no one else. The cache pointer is unbound afterwards and both the bucket and cache pointer is made null.
Exceptions:
CDanglingPtr
CSyncError
CAccessViolation
CFileReadError
CFileNotReady
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CChecksumError
void CacehPtrb::Bind() - Protected member function used to bind a cache pointer to a bucket containing data stored at the file address. This function checks for null buckets and null file addresses. It will throw a null pointer exception if an attempt is made to bind to a null bucket or address. The null pointer test is only performed when binding to a bucket and not on every cache pointer access. The exception will only be generated if a non-existent object is encountered during the bind. If the bucket is not a null value, this function will test the file address of the bucket to see if the bucket data is pointing to the correct file address. This means the bucket was unlocked (no longer being used) and has not been altered since. This condition is referred to as a fast bind because no actions are taken other then simply locking the bucket (marking it as being used.) If the bucket is a null value or points to the wrong file address, a call to the Cacheb::ReserveBkt() function is made to reserve a new bucket in the cache and lock it.
Exceptions:
CNullPtr
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CAccessViolation
CFileReadError
CFileNotReady
CChecksumError
The CachePrt class is derived from the CachePtrb base class and implemented as template class or a non-template class. To avoid any portability problems with template classes, define the __NOT_USING_TEMPLATE_CLASS__ macro at compile time and directly code the CachePtr class, the Bucket class, and the Cache class for the data type that will be used in the "cactype.h" file. The cactype.h include file is used to set the data type for the non-template version of the CachePtr class, the Bucket class, and the Cache class. Each data type is set by macros that enable type definitions of the specific data type. This allows the cache classes to be used for several data types. Each type must be determined at compile time and only one type can be used for each compile. Define the __BTREE_MNODE__ at compile time to use B-tree multi-way nodes for the cache data type.
The main purpose of the CachePtr class is to take a file address and make it appear as a pointer to a TYPE object. This accomplished by overloading the * and -> operators. These overloaded operators bind the bucket (mark it as being used) and cast the generic Bucketb pointer to the Bucket type. Due to multiple inheritance, these overloads give cache pointer objects access to the members of the Bucketb class and of TYPE.
template<class TYPE> class CachePtr : public CachePtrb { public: CachePtr(Cache<TYPE> &c, FAU p = 0) : CachePtrb(c, p) { } CachePtr(const CachePtr<TYPE> &c) : CachePtrb(c) { } void operator=(const CachePtr<TYPE> &c) { Copy(c); } void operator=(__LWORD__ p) { Release(); Address = p; } public: void Delete() { CachePtrb::Delete(sizeof(TYPE)); } Bucket<TYPE> &operator*() { // Bind if needed and return a reference to this bucket if(!bound) Bind(); return *(Bucket<TYPE> *)bkt; } Bucket<TYPE> *operator->() { // Bind if needed and return a pointer to this bucket if (!bound) Bind(); return (Bucket<TYPE> *)bkt; } friend void *operator new(size_t n, CachePtr<TYPE> &cp, __LWORD__ p = 0) { return (TYPE *)((Bucket<TYPE> *)(cp.Alloc(n, p))); } };
The new operator is overloaded to allocate a new object to reside at address "p" in the file. If "p" is a non-zero value, the object is presumed to already been allocated. By default "p" will always equal zero. The allocation takes place by calling the CachePtrb::Alloc() function, which not only allocates the object on disk, but also reserves a cache bucket for it in memory. A Bucketb pointer to this bucket is returned by CachePtrb::Alloc(), which is typecast as a Bucket<TYPE> pointer. Then a typecast to a TYPE pointer, retrieves a pointer to the bucket's data through multiple inheritance. The TYPE pointer is returned from the new operator, and is passed to the constructor as the this pointer of the TYPE object being constructed.
Exceptions generated by CachePrtb:: function calls:
CCacheFull
CNullPtr
CDanglingPtr
CEOFError
CFileWriteError
CFileNotWriteable
CFileSeekError
CAccessViolation
CFileReadError
CFileNotReady
CChecksumError
The Report() function is a standalone function used to display cache performance statistics to the console. It is only used for test purposes when developing file based applications that use the cache classes.
void Report(Cacheb &c, int full_report) - Displays cache performance statistics to the console. A full report will display the bucket's address, reference count, and dirty flag for every cache bucket allocated. All reports will display the number of buckets in use, the number of buckets reserved, and a percentage representing the total number of fast binds, hits, and misses.
FAST BINDS:
Fast binds take place when a cache pointer references or binds to a bucket that previously existed, was unbound, and did not change its state in memory or on disk. After a fast bind occurs, the bucket is simply locked by the cache pointer without having to reserve another bucket in the cache.
HITS:
Hits represent the number of buckets that did not have to be acquired because the bucket for the file address already existed in the cache.
MISSES:
Misses represent the number of cache buckets that had to be acquired because the bucket for the file address did not exist in the cache.
The cache classes can generate any of the following exceptions if the CPP_EXCEPTIONS macro is defined in the EHandler class at compile time. NOTE: The appropriate try and catch statements within the application must handle each exception. If the CPP_EXCEPTIONS macro is not defined, the global error hander will signal that an exception has occurred and will terminate the program as defined in the EHandler::Terminate() function.
CCacheFull - this exception is thrown if all the cache buckets are locked, indicating that no more buckets can be acquired until another bucket is released.
CNullPtr - this exception is thrown if an attempt is made to access a null file address or null memory location.
CAccessViolation - this exception is thrown if an "end of file" error occurs during multiple file access. This exception was added to tell the application that the file has grown in size but the EOF marker has not.
CDanglingPtr - this exception is thrown if there are any dangling references to the object pointer, indicating that this object is still being used by another object.
CEOFError - this exception is thrown if an "end of file" error is encountered.
CFileNotReady - this exception is thrown if the file is not ready for reading.
CFileNotWriteable - this exception is thrown if the file cannot be written to.
CFileReadError - this exception is thrown if an error occurs while reading from the file.
CFileSeekError - this exception is thrown if an error is encountered during a seek operation.
CFileWriteError - this exception is thrown if the number of bytes written do not match the number of bytes requested to write.
CSyncError - this exception is thrown if the block header's check word cannot be read. This indicates a file synchronization error, meaning the VBD file manager is not reading a valid variable data block or the block is corrupt.
CChecksumError - this exception is thrown if a bit error occurs when writing to disk. A 32-bit CRC checksum based on the Ethernet polynomial of 0x4C11DB7 is calculated when any data is written to the VBD file. The calculated checksum is then compared to data actually stored on disk. If the calculated checksum does not match the actual checksum, a bit error has occurred during data storage.